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We perform a thorough analysis on the choice of estimators for random series path integral meth- 
ods. In particular, we show that both the thermodynamic (T-method) and the direct (H-method) 
energy estimators have finite variances and are straightforward to implement. It is demonstrated 
that the agreement between the T-method and the H-method estimators provides an important 
consistency check on the quality of the path integral simulations. We illustrate the behavior of the 
various estimators by computing the total, kinetic, and potential energies of a molecular hydrogen 
cluster using three different path integral techniques. Statistical tests are employed to validate the 
sampling strategy adopted as well as to measure the performance of the parallel random number 
generator utilized in the Monte Carlo simulation. Some issues raised by previous simulations of the 
hydrogen cluster are clarified. 
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I. INTRODUCTION 

Numerical path integral methods have proved to be 
highly useful tools in the analysis of finite temperature, 
many-body quantum systemsi A central theme in such 
studies is the conscious use of dimensionality, both in the 
reformulation of the original problem and in the subse- 
quent numerical simulations. 

As the scale of the problems under study continues 
to grow, it becomes increasingly important that the for- 
mal properties of the numerical methods that are utilized 
be properly characterized. Recently, Predescu and co- 
workers^^ have presented a number of results concern- 
ing the convergence properties of random series-based 
path integral techniques. Important in their own right, 
these formal properties have also led to the development 
of a new class of path integral methods, the so-called 
reweighted techniques*^ Reweighted approaches acceler- 
ate the convergence of "primitive" series methods by in- 
cluding the effects of "higher-order" path variables in 
a simple, approximate fashion. Reweighted methods 
achieve the convergence rate of related partial averag- 
ing approaches^ without requiring the construction of the 
Gaussian transform of the underlying potential energy 
function. 

Previous work on the reweighted method has fo- 
cused principally on the construction of the quantum- 
mechanical density matrix^^ In the present work, we 
wish to examine estimators for various coordinate- 
diagonal and off-diagonal properties. While the present 
discussion is focused principally on reweighted methods, 
the results obtained are broadly applicable to more gen- 
eral random series approaches. 

In Section II of the present article, we examine the 
thermodynamic (T-method) and direct (H-method) esti- 
mators for the total energy. In order to avoid any con- 



fusion with earlier estimators, we mention that in the 
present article by T-method and H-method estimators we 
understand the respective energy estimators introduced 
by Predescu and Doll in Ref. I2- Thus, the T-method es- 
timator we employ does not have the variance difficulties 
associated with the Barker estimator for large numbers of 
path variables^ As the low-temperature simulation pre- 
sented in the second part of the article demonstrates, 
the present T-method estimator does not exhibit any of 
the difficulties sometimes associated with the virial esti- 
mator for low-temperature systems or for strongly cor- 
related Monte Carlo sampling techniques &2*i2iiLiS The 
T-method estimator is closely related and similar in form 
to the centroid virial estimator We expect the two 
estimators to have similar behavior with the nature of the 
quantum system, the temperature, and the Monte Carlo 
sampling method. However, an important difference be- 
tween the two estimators is the fact that the T-method 
estimator is a veritable thermodynamic estimator, in the 
sense that it is obtained by temperature differentiation of 
the quantum partition function. This observation is im- 
portant because the temperature differentiation can be 
implemented numerically by a finite-difference scheme 
and, in principle, may lead to numerically stable algo- 
rithms that do not require derivatives of the potential. 
For large dimensional systems or systems described by 
complicated potentials, we expect such algorithms to be 
significantly faster than those based on explicit analyti- 
cal formulas. The relative merits of such algorithms will 
be examined in future work. 

In Section III, we examine the application of the 
reweighted methods to a model problem, that of simulat- 
ing the thermodynamic properties of the {^2)22 molec- 
ular cluster. In Section IV, we summarize our present 
findings and clarify a number of issues raised in previous 
studies of this molecular hydrogen system>iSii& 



2 



II. ENERGY ESTIMATORS 



In this section, we consider a one-dimensional quan- 
tum canonical system characterized by inverse tempera- 
ture (3 = l/(k B T) and set forward the task of computing 
its average energy by Monte Carlo integration methods 
developed around several reweighted techniques4*& The 
physical system is made up of a particle of mass mo mov- 
ing in the potential V{x). We discuss the numerical im- 
plementation and the computational merits of both the 
T-method and H-mcthod estimators. Any time the mul- 
tidimensional extension is not obvious, we present the 
explicit formulas of the respective estimators. 

We begin by presenting the general form of the path 
integral methods we employ in this paper. We re- 
mind the reader that in terms of a standard Brownian 
motion {B u ,u > 0}, the Feynman-Kac, formula has the 
expression^ 

p(x, x'\ f3)=P [<jBi = x'\<tB = x] 

xE \e- ti v (*B u )du\ aBi = ^ aBo = 1 ( ^ 

where a = (h 2 (3 /mo) 1 / 2 . In this paper, we shall use the 
symbol E to denote the expected value (average value) of 
a certain random variable against the underlying proba- 
bility measure of the Brownian motion B u . It is straight- 
forward to see that the first factor of the product in 
Eq. 0} (which represents the conditional probability den- 
sity that the rescaled Brownian motion aB u reaches the 
point x 1 provided that it starts at the point x) is the 
density matrix of a free particle of mass mo 

P \oB\ = x'\uBq = x] = Pf P (x, x';/3). 

Moreover, rather than using the conditional expectation 
appearing in the second factor of Eq. Q, one usually 
employs a stochastic process {B®;0 < u < 1}, called 
a standard Brownian bridge^ii^ which is defined as a 
standard Brownian motion conditioned on the end points 
such that Bq = and B® = 0. In terms of the newly 
defined process, the Feynman-Kag formula reads 



p fp (x,x';f3) 



Eexp 



-p / V[x r (u) + aB° u ]du 
'o 



where x r (u) = x + (x' — x)u is a straight line connecting 
the points x and x' and is called the reference path. 

As discussed in Ref. 0, one of the most general con- 
structions of the standard Brownian bridge is given by 
the Ito-Nisio theorem^ Let {Afe(r)}fe>i be a system of 
functions on the interval [0, 1], which together with the 
constant function A (r) = 1, make up an orthonormal 
basis in L 2 [0, 1]. Let 



Afc(i) = / X k (u)du. 
Jo 

If f2 is the space of infinite sequences a = (ai, ct2, . . .) and 



dP[a) = JJ dp(a k ) 



(2) 



k=l 



is the probability measure on f£ such that the coordi- 
nate maps a — > a k are independent identically distributed 
(i.i.d.) variables with distribution probability 



dfi(ai 



1 



/2tt 



:e- Q '/ 2 da, 



then 



B° u {a) =^a fc A fe ( u ), 0<u<l; 



(3) 



(4) 



fc=i 



i.e., the right-hand side random series is equal in distribu- 
tion to a standard Brownian bridge. The notation B®(d) 
in Q is then appropriate and allows us to interpret the 
Brownian bridge as a collection of random functions of 
argument a, indexed by u. 

Using the Ito-Nisio representation of the Brownian 
bridge, the Feynman-Kac formula takes the form 



P(x,x';(3) 
p fp (x,x';P) 



dP[a] exp 



a^a k A k (u) 



■13 I V 



x r (u) 



k=l 



du 



(5) 



For a multidimensional system, the Feynman-Kag for- 
mula is obtained by employing an independent random 
series for each additional degree of freedom. 

A reweighted method constructed from the random se- 
ries X^^Li a fcA/c(«) is any sequence of approximations to 
the density matrix of the form4 



p fp (x,x';(3) 
x exp 



dp(a qn + p ) 



(3 / V 

o 



qn+p 

x r (u) + a ^2 akAn,k(u) 

k=l 



where q and p are some fixed integers, where 
A n ,fc(w) = Afc(n) if 1 < k < n, 

and where 

qn+p oo 
fc— n+l k— n+1 



du\, (6) 



(7) 



(8) 



In Eq. IjBJ , n indexes the sequence of reweighted approx- 
imations p„ w (x, x'\ f3), sequence that converges to the 
density matrix p(x,x';(3) in the limit n — ► oo. Remark 
that the approximation of index n actually utilizes qn +p 
variables for path parameterization. In the construction 
of a certain path, the first n functions K n ^ k (u) coincide 
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with the ones for the corresponding series representation, 
as shown by Eq. J7J). A number of (q— l)n+p additional 
functions are constructed so that to maximize the order 
of convergence of the reweighted approximation. Notice 
that if the resulting approximation has a convergence of 
order a as measured against n, then it has the same order 
of convergence when measured against the total number 
of variables qn + p, though the convergence constant is 
q a times larger. This explains why the number of addi- 
tional functions is chosen to scale linearly with n. For 
additional information, the reader is advised to consult 
Ref . H 

It is convenient to introduce the additional quantities 
X n (x, x', a; 0) and X a3 (x 1 x' , a; 0), which are defined by 
the expressions 

X n (x,x',a;0) = pf P (x,x';0) 

„ 1 qn+p 



x exp { — J V 



.(u) + a ^2 cL k k n ^(u) du \ (9) 



fe=i 



and 



Xecixjx' ,a; f3) = pf p (x,x';0) 



x exp \ — 13 J V 



c r (u) + a afcAfe(u) du>, (10) 
fe=i J 

respectively. With the new notation, Eq. © becomes 



p™(x,x';0) = / dP[a]X n (x,x',a;0), (11) 
Jn 



while the Feynman-Kac. formula reads 



p(x,x';0) = / dP[a]-X 0Q (a: > !B',a;/3). (12) 



The analytical expressions of the functions A n ^(u) de- 
pend on the nature of the reweighted techniques and are 
generally chosen to maximize the asymptotic convergence 
of the respective reweighted techniques. 4 To a large ex- 
tent, the specific form of these functions is not important 
for the present development, but the reader is advised to 
consult Refs. and |^for quadrature techniques and ad- 
ditional clarifications. 

The remainder of the present section is split into two 
parts. First, we discuss the problem of computing the en- 
semble averages of operators diagonal in coordinate rep- 
resentation. In particular, this resolves the problem of 
computing the average potential energy. Second, we con- 
sider the problem of evaluating the total energies (hence, 
also the kinetic energies) by means of the T-method and 
H-method estimators. 



A. Operators diagonal in the coordinate 
representation 

By definition, the ensemble average of an operator O 
diagonal in the coordinate representation is 



(O) = 



L p(x; (3)0(x)dx 



J R p(x;0)dx 



(13) 



The quantity p(x; [3) = p(x, x; f3) is the diagonal den- 
sity matrix. By convention, we drop the second variable 
of the pair (x,x') any time x = x' . For instance, we 
use X n (x, a; j3) instead of X n (x, x, a; 0). By means of 
Eq. fl!2jl. the average above can be recast as 



(O) 



/3 



J R dx hdPialXoofaaiPMx) 
lR dx /orf-P[a]Aoo(a;,a;/3) 



(14) 



This average can be recovered as the limit n — > oo of the 
sequence 



(O) 



P t _ Jg dx J n dP[a]X n (x, a; (3)Q(x) 
/ K dx J Q dP[a]X n (x, a; f3) 



(15) 



the terms of which are to be evaluated by Monte Carlo 
integration. The estimating function 0{x) appearing in 
the above formula is called the point estimating function 
of the operator O. 

An alternative to the point estimating function is 
the so-called path estimating function, the derivation of 
which is presented shortly. As demonstrated in Appendix 
A, the function 0{x) appearing in Eq. i|14|) can be re- 
placed by 0[x + aB®(a)], without changing the value of 
the average (0}g. That is, the equality 



(Oh 



J R dx J n dP[a]Xoo(a;, a; (3)0[x + crB°(a)} 
fw. d x In dP^Xoo^x, a; (3) 



is valid for all < u < 1. Averaging over the variable u, 
one obtains 



(O) 



13 



J R dx f n dP[a]Xoo(x, a; (3) 0[x + aB°(a)]du 
J R dx f n dP[a]Xoo(x,a; 0) 

(16) 

Eq. H16(l shows that the ensemble average of the opera- 
tor O can also be recovered as the limit n — * oo of the 
sequence 

th = Jr dx J n dP[a]X n (x, a; 0) Jp Q[x + <t£?° „ (a)]du 
{ ' p - n ~ J R dxJ n dP[a}X n (x,a;0) 

(17) 



where we have set 



qn+p 



fe=l 



for convenience of notation. 



4 



In the remainder of the present subsection, we dis- 
cuss the relative merits of the point and path estimators. 
We first consider which of (0)^ n and (O)^ is closer to 
(O)o for a given n assuming the averages given in Eqs. 
(|15|l and l|17(l are computed exactly. Let us notice that 
Eq. 115(1 can be put in the form 



The Cauchy-Schwartz inequality implies 



t 5 R dxp™{ X] p)Q{x) 
{ f R dxp™(x;0) 



The probability distribution 



Pn 



RW (x; I3)dx 



f R p™(x;f3)dx 



(18) 



represents the marginal distribution of the variable x re- 
garded as a random variable on the space R x CI, which 
is endowed with the probability measure 



X n (x, a; (3)dx dP[a] 
J R dx J Q dP[d]X n (x, a; (3) 



(19) 



The reweightcd techniques are designed so that the dis- 
tribution given by Eq. ((TBI) is as close as possible to the 
quantum statistical one, which is given by the expression 

p(x; (3)dx 



J R p(x; f3)dx 



In designing the reweighted techniques, one seeks 
to optimize the rate of convergence of the sequence 
p RW (x, x'\ (3) -> p(x, x'; (3) for all x and x 1 X 

For arbitrary u, the marginal distribution of x + 
aB® n (d) is usually different from the one given by 
Eq. (|18fl and is not optimized. With few notable ex- 
ceptions to be analyzed below, the points x + aB® n (d) 
for different u are not equivalent, and their probability 
distribution may differ significantly from the quantum 
statistical one. (However, as shown in Appendix A, they 
become equivalent in the limit n — > oo.) Therefore, espe- 
cially for those reweighted techniques having fast asymp- 
totic convergence, we expect the point estimator to be 
more rapidly convergent with n than the path estimator. 

An additional issue appearing in Monte Carlo compu- 
tations is the variance of the two estimating functions 
0{x) and J Q 0[x + aB^ n (d)]du. In the limit n — > oo, the 
variance of the point estimating function converges to 



fu^x J n dP[a]Aoo(a;,a;/3)0(a;) 5 



(0)\ 



J R dx J Q dP[a]X OD (x, a; (3) 
_ Jg dx J n dP^X^jx, a; 13) $1 Q[x + <jB°(a)} 2 du 
fm dx In dP[d]X oa (x, a; (3) 

while the variance of the path estimating function con- 
verges to 

/ R dx J n dPfflX^x, a; (3) {/* 0[x + aB a (a)}du} 2 a 
J R dx ^dP^X^x^P) 



0[x + aBl(a)]du\ < J 0[x + a B°{d)} 2 'du 



and shows that the variance of the path estimating func- 
tion is always smaller than that of the point estimating 
function. The actual decrease in the variance is not al- 
ways significant because the points x + aB®(d) for dif- 
ferent u are strongly correlated. Depending on the na- 
ture of the function 0(x), the variance decrease may not 
compensate the effort required to compute the average 
J 0[x + aB® n (d)]du. However, if the function 0(x) is 
the potential V(x), then the smaller variance of the path 
estimator is a desirable feature because the path average 
f Q V[x + aB® n (a)]du, which also enters the expression 
of X n {x, a; (3), is computed anyway. 

To summarize the findings of the present subsection, 
the point estimator provides a more accurate value but 
has a larger variance than the path estimator. We next 
ask if there are any methods for which one may construct 
an estimator providing the same values as the point es- 
timator but having the variance of the path estimator. 
More precisely, we seek methods for which there is a divi- 
sion = uq < i*i < . . . < u qn < u qn+ i — 1 such that the 
mesh maxo<i< 9n |iti+i — Ui\ converges to zero as n — > oo 

and such that the points jx + aB®. n (a); < i < q n + 1 j 

have the same marginal distribution as x. For such meth- 
ods, the expected value of the estimating function 



Y,0[x + aB° Uin (d)}(u l+1 -u t ) 



(20) 



i=0 



under the probability distribution given by Eq. <|19[) is 
an estimator satisfying the criteria outlined in this para- 
graph. 

There are two methods we employ in the present paper 
for which such an estimator exists. The first one, is the 
trapezoidal Trotter discrete path integral method (TT- 
DPI) obtained by the Trotter composition 

Pn T (x,x';f3) = J dxi ...J dx n p Q (x,xx; 



■ Po I x n , x ; 



n + 1 



(21) 



of the short-time approximation 



Po (x,x';P) = p fp {x,x';0)exp 



-a 



V(x) + V(x') 



It has been shownSfi that for n = 2 k - 1, the TT-DPI 
method admits the following implementation 
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p fp (x,x';(3) 

( 2 k 

x exp < —py^^ujjV 



k T 



■JTffr = / d a i,i ■ ■ ■ I da kj2k -i (2tt) " /2 exp -- ^ ^ , 



Z=l i=l 



i=0 



.(Ui) +&22 ■^ , i,[2 , - 1 «i]+l( u i) a J,[2'- 1 ti i ]+l 



(22) 



1=1 



where m = 2 fe i for < i < 2 k and 

f 2-( fc +D, if z e {0,2 fc }, 
1 \ 2- k , if 1 < i < 2 fe - 1. 

The functions Fi,k(u) are the so-called Schauder 
functionSfSi the definitions of which are presented in the 
cited references. We leave it for the reader to use Eq. 121|) 
and show that if x = x', then all the points x + aB®. n (a) 
have identical marginal distribution given by the formula 

In this case, the point and the path estimators produce 



identical results for the ensemble average of a diagonal 
operator O 

^pr{x-p)o{x)dx 

f M Pn T &P) dx 

At least for the ensemble average of the potential en- 
ergy, one should always use the path estimator, which 
has smaller variance. 

A second method for which there is an estimator giv- 
ing the same values as the point estimator but having 
(asymptotically, as n — > oo) the variance of the path es- 
timator is the so-called Levy-Ciesielski reweighted tech- 
nique (RW-LCPI) defined by the formula^ 



Pn C (*,*';/3) 

pf p (x,x';0) 



da i 



da 



k+2,2 k + 1 



(27T 



-(4n+3) /2 



k+2 2 



exp 



2 "'J 

1=1 3=1 



x exp < — P J V 



k+2 



c r {u) + <7^aj )[2 '-itt]+i ^,pi-i„]+i( w ) 



1=1 



du 



(23) 



where [2 i_1 w] is the integer part of 2 l ~ 1 u. It has been 
shown that for n = 2 k — 1, the RW-LCPI method can be 
put in the Trotter product form 4 



p n (x,x\f$)= / dx 1 ... dx n p [x,x 



n + 1 



■Pa 



LC 



J. P 



n + 1 



, (24) 



where 

P^ C (x,x';f3) 



p fp {x,x';/3) ( 27r ) 3 / 2 



D -(a 2 +4+a 2 )/2 



x exp < —ft V[x + (x' — x)u + a\aCo(u) 



+a2<rLo(' lt ) + ciz<jR Q {u)\du >daida 2 da. 



The analytical expressions of the functions (u), 



Lq(u), Rq(u), and Co (it) can be found in Refs. and 

D 

Again, we leave it for the reader to use Eq. (|24() and 
prove that if x' = x, then all the points 

fe+2 

x + a X! a J,[2'-iu 4 ]+l ^,[2*-i U4 ]+l( U *) 
1=1 

with m — 2~ k i for < i < 2 k have identical marginal 
distributions equal to that of x. The estimator 



2"-l 
i=0 



o 



k+2 



(25) 

produces the same results as the point estimator, but it 
has the variance of the path estimator. As far as the 
evaluation of the average potential energy is concerned, 
in order to avoid unnecessary calls to the potential rou- 
tine, it is desirable that the points {2~ k i;0 < i < 2 k } 
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be among the quadrature points utilized for the compu- 
tation of the path averages appearingin Eq. (|23[1 . The 
quadrature technique designed in Ref. J6J shares this prop- 
erty. As opposed to the TT-DPI method, the point and 
the path estimators for the RW-LCPI method produce 
different results. 



where the T- method estimating function E^ D (x, a; (3) can 
be shown to be^ 



B. Estimators for the total energy 



E*(x, a; (3) = ^ + / V [x + aB° u (a)] du 



In this subsection, we discuss the implementation of P 
the thermodynamic (T) and the direct (H) estimators a f 1 , r -, 

for the total energy. The T-method estimator is defined +2 J V l x + <7B u( a )\ B u{ a ) du 



as the following functional of the diagonal density matrix: 



(28) 



p(x; /3)dx 



(26) 



The above formula can be expressed as the statistical 
average 



/ eat Ir dx In dPlalXoo {x, a; (3)E^ (x, a; (3) 



J R dx J n dPlalXoo (x, a; (3) 



(27) 



provided that e~^ v ^ has (Sobolev) first order deriva- 
tives as a function of x. For a d-dimensional system, the 
expression of the T-method estimating function reads 



El } (x 1 ,...,x d ,d 1 ,...,d d ;f3) = 7713+ V [x! + axB^iai), . . . ,x d + a d B°' d (a d )] du 

Z P Jo 

+ -^V [x x + axB^iar), ...,x d + a d B° u d (a d )] } B /^) du. 



(29) 



The ensemble average energy can be obtained as the 
limit n — > oo of the sequence 



where 



Jn dx In dP[a]X n (x, a; f3)E^(x, a; (3) 
/ H dx f n dP[a]X n (x, a; (3) 



(30) 



+- / V 



V 



x + aBl ;n (a) 



du 



x + aB° a Ja) B° u Ja)du. (31) 



The finite-dimensional integral appearing in Eq. I|30[) can 
be evaluated by Monte Carlo integration. In the limit 
n — > oo, the variance of the estimator is finite because the 
square of E^(x, a; (3) given by Eq. (|2"T1) is a well defined 
function, the average value of which is finite for smooth 
enough potentials. 

A second energy estimator we employ in the present 
paper is the H- method estimator. This direct estimator 
is defined by the equation 



/ rp\H f® H x'P(x,x';P)\ dx 



L p(x; f3)dx 



(32) 



where the Hamiltonian of the system H x i is assumed to 
act on the density matrix through the variable x' . By 
explicit computation and some integration by parts, the 
H-method estimator can be expressed as the statistical 
average 



/E nH I R dx J n dP[a]X 00 {x,a;(3)Eg(x,a;f3) 

\ h l[3 = r j„ r jnrd v o\ \ i6 ) 



J m dx /jjtfHlco^a;)?) 



of the estimating function 2 



E^(x,d;(3) = - + V(x) + --- (u-rf 



2/3 ' ' w ' 4m J Q J ' 
xV'[x + aB° (a)}V'[x + aB° T {d)]dudT. (34) 



The H-estimator is properly defined even for potentials 
that do not have second-order derivatives. For a d- 
dimensional system, the H-method estimating function 
reads 
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,x d ,ai,...,ad;l3) = ^ + v ( x ^> 



_d_ 

dx 



0,i Jo Jo 



V [x x + aiB^iat), ...,x d + a d B^ d {a d )] j 
{^h + .^W,,^! a d B^ d {a d )] J du dr. 



(35) 



The reader should notice that the double integral ap- 
pearing in Eq. is really a sum of products of one 
dimensional integrals. Indeed, one easily computes 

X {/ v 2 V'[x + <rB%(a)]du^ |^ V'[x + aB°(a)]d 

h 2 p 2 < <- x 



2mo 



uV'[x + aB u {a)]du 



(36) 



The H-method estimator is the sum of the "classical" 
energy and a "quantum" correction term. Equation l|33(l 
shows that the total energy can also be recovered as the 
limit n — > oo from the sequence 



where 



J R dx J n dP[a]X n (x, a; (3)E^(x, a; (3) 
Ir dx J n dP[a]X n (x, a; 0) 



, (37) 



e2„2 rl r l 



E%(x,a;l3) = ±- + V(x) + ^ . 

2p 4m Jo Jo 

x V' [x + oBl n (a)} V [x + aB° T n (a)} du dr. (38) 

The forms of the T- and the H-method estimators derived 
here with the reweighted techniques in mind extend nat- 
urally to the TT-DPI method by means of Eq. ffity. One 
just replaces the one dimensional integrals appearing in 
Eqs. H31(l and (|38|l by appropriate trapezoidal quadrature 
sums. 

For the reweigthed techniques, we anticipate that the 
kinetic energy estimator entering the H-method estima- 
tor provides more accurate results than the kinetic en- 
ergy estimator entering the T-method estimator. As for 
the point and the path estimators of diagonal operators, 
the derivatives of the density matrix against the spatial 
coordinates, which measure fluctuations around the pref- 
erential points x and x' for which the reweighted density 
matrices are optimized, are expected to be reproduced 
in a better way than the temperature derivatives, which 
involve unoptimizcd path-averaged fluctuations. How- 
ever, for sufficiently low temperatures, the variance of 
the H-method kinetic energy estimator is expected to be 
larger than the variance of its thermodynamic counter- 
part. This larger variance is due to the factor 1 appear- 
ing in Eqs. lO and (J3SJ. 



There is one special property of the T- and H-method 
estimators that proves to be important in simulations. 
Let us notice that by virtue of the Bloch equation 

d 

H x ,p(x, x; P) = -—p(x, x'\ P), 

we have the equality 

(E) p := (E)» = (E) T . 

Here, the symbol := signifies that the average energy 
(E) p is defined to be the common value of the T-method 
and the H-method energy estimators, provided that these 
are equal. However, since p^ w (x, x'; f3) does not satisfy 
the Bloch equation (except for the free particle), in gen- 
eral 

H f M H x ,p™(x,x>;l3)\ x ^ x dx 

{E)f3,n = - 



J R p^(x;P)dx 
P 



RW (x;f3)dx 



and the T- and H-method estimators produce the same 
result only in the limit n — > oo. Given that the two 
energy estimators discussed in the present section can 
be computed simultaneously without incurring any com- 
putational penalty, we recommend that the agreement 
between the T- and the H-method estimators be used as 
a verification tool in actual simulations in order to check 
the convergence of various path integral methods. How- 
ever, we emphasize that the agreement between the T- 
and the H-method estimators is not a sufficient conver- 
gence criterion and in practice, the convergence of differ- 
ent ensemble averages with the number of path variables 
should also be monitored. 

As Eqs. H31JI and (|38|l show, the path and the point 
estimating functions for the potential energy enter nat- 
urally the expressions of the T- and H-method estimat- 
ing functions, respectively. For the purpose of using the 
agreement between the two energy estimators as a veri- 
fication tool for convergence, one should not replace the 
path estimating function for the potential energy in the 
expression of the T-method estimator with the point esti- 
mating function, even if this may improve the estimated 
energy. For special cases, as for instance the TT-DPI and 
RW-LCPI methods discussed in the previous subsection, 
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one may replace the point estimating function for the 
potential energy appearing in the expression of the H- 
method estimator with other estimating functions that 
produce the same value but have smaller variance. In 
this paper, we replace the point estimating function with 
the path estimating function for the TT-DPI method and 
with the estimating function given by Eq. i|25[) for the 
RW-LCPI method, respectively. 



III. A NUMERICAL EXAMPLE 

We have tested the relative merits of the T- and H- 
method energy estimators on a cluster of 22 hydrogen 
molecules at a temperature of 6 K, using three differ- 
ent path integral methods. Two of these methods, the 
trapezoidal Trotter discrete path integral method and a 
Levy-Ciesielski reweighted technique, have been already 
presented in the preceding section. The third method is 
a Wiener-Fourier reweighted (RW-WFPI) technique in- 
troduced in Ref. 0. The numerical implementation of the 
methods has been extensively discussed in Ref.|6jby some 
of us and are not reviewed here. 

The physical system we study has been recently ex- 
amined by Chakravarty, Gordillo, and Ceperleyi^ as well 
as by Doll and Freeman^ in their comparison of Fourier 
and discrete path integral Monte Carlo methods. The 
total potential energy of the (H 2 )22 cluster is given by 



N 



N 



V tot = Y,V LJ (r ij ) + Y,V c (r i ), 



(39) 



i<3 



where Vtj-(rjj) is the pair interaction of Lennard-Jones 
potential 



V L j{r lj ) = Ae LJ 



,12 / \ 6 
/ a L.J \ ( CTLJ 



and V c (ri) is the constraining potential 



V c {n) = e LJ 



|r t - R c 



211 



(40) 



(41) 



The values of the Lennard-Jones parameters olj and 6lj 
used are 2.96 A and 34.2 K, respectively^ R cm is the 
coordinate of the center of mass of the cluster and is given 
by 



1 



n 



i=l 



(42) 



Finally, R c — Agl.j is the constraining radius. The role of 
the constraining potential 14 (n) is to prevent molecules 
from permanently leaving the cluster since the cluster 
in vacuum at any finite temperature is metastable with 
respect to evaporation. 

At the temperature of 6 K and at the small densities 
employed in our computation, the molecules of hydrogen 



can be described by spherical rotational wave functions, 
because the majority of the molecules are in the J = 
state. To a good approximation, the molecules can be re- 
garded as spherical bosons interacting through isotropic 
pair potentials. However, a thorough study of parahy- 
drogen clusters has showed that quantum exchange of 
molecules is small at temperatures greater than 2 K and 
that the hydrogen molecules can be safely treated as dis- 
tinguishable particles^ 

The optimal choice of the parameter R c for the con- 
straining potential has been discussed in recent worki^ 
If R c is taken to be too small, the properties of the sys- 
tem become sensitive to its choice, whereas large values 
of R c can result in problems attaining an ergodic simula- 
tion. To facilitate comparisons, in the current work, R„ 
has been chosen to be identical to that used in Ref. Ilia . 
While this choice of constraining potential can induce 
ergodicity problems in calculations of fluctuation quan- 
tities like the heat capacity, we provide evidence below 
that the simulations in the current work are ergodic. 

The three path integral methods we have employed uti- 
lize different numbers of path variables for a given index 
n. For instance, the TT-DPI n-th order approximation 
to the density matrix p^ T (x, x'\ (3) utilizes n path vari- 
ables for each physical dimension, whereas p^ (x , x' ; (3) 
and p^ F (x,x';(3) utilize 4n + 3 and An path variables, 
respectively. To ensure fair comparison with respect to 
the number of path variables employed, we have tabled 
the total number of variables n v used for each physical 
dimension and not the index n. 



A. Sampling strategy 

We have discussed in Section II that the evaluation 
of the ensemble average of any observable eventually re- 
duces to the evaluation of the average of a certain esti- 
mating function against the probability distribution 



X n (x, a; [3)dx dP[a] 
J R dx f Q dP[a]X n (x, a; /?) 



(43) 



or its multidimensional counterpart. This probabil- 
ity distribution can be sampled with the help of the 
Metropolis algorithm, which comprises the following 



steps 



2i.25 



One initializes the imaginary-time paths in 



some fashion. Then, one attempts a trial move of the 
paths, which may involve changing several coordinates 
at a time. The displacement of the new paths is usually 
chosen to be relative to the old paths. To ensure ergod- 
icity, one makes sure that all variables of the system are 
eventually moved in a cyclic or a random fashion. The 
proposed path is then accepted or rejected with a certain 
probability. The average value of the quantity of interest 
is computed by averaging the values of the corresponding 
estimating function evaluated at the current paths. 

To establish some notation necessary for our discus- 
sion, for each vector = (x i7 t/j, Zj) denoting the physical 
coordinates of the particle i, we let a.; = {a^i, . . . , sn t n v } 
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be the collection of path variables associated with the 
respective particle. Each 

(x y z \ 
fl i,fc' a i,fc' a i,k J 

is itself a three-dimensional vector whose components de- 
note the k-th parameter of particle i for the x, y, and z 
coordinates, respectively. Going back to the description 
of the Metropolis algorithm, the full imaginary-time path 
has been initialized by choosing the physical coordinates 
n randomly in a sphere of radius R c centered about ori- 
gin. The path variables £4 have been initialized with 
zero. 

Except for the Wiener- Fourier method with n v = 512 
(n = 128), we update the individual particles one at a 
time in a cyclic fashion. Each update of a particle consists 
of an attempt to move the physical coordinate r j together 
with the first one quarter of the path variables a^ (that 
is, together with the variables {a,^; 1 < k < [n v /4]}) fol- 
lowed by a separate attempt to move the rest of the path 
variables associated with the particle i. Both the phys- 
ical coordinates and the path variables are moved in a 
cube centered about the old coordinates: 

t[ = n + A r (2u- 1) 

and 

a; ifc =ai, fc + A a (2u-l), 

where the three components of u are independent uni- 
formly distributed random numbers on the interval [0, 1]. 
Throughout our simulations, we have used the follow- 
ing maximum displacement values: A r = 0.26 A and 
A a = 0.15. The sampling technique employed guaranties 
an acceptance ratio between 30% and 70% for all meth- 
ods studied and for n v < 256. 

Because the acceptance ratio drops below 20% for the 
Wiener- Fourier reweighted technique with n v = 512, 
each most basic step of the previously described algo- 
rithm has been decomposed into two successive steps. 
The first step is decomposed into an attempt to move 
the physical coordinate together with the first 1/8 of 
the path variables a^, followed by an attempt to move 
the physical coordinate together with the next 1/8 
path variables a^. The second step is decomposed in a 
similar fashion; half of the remaining variables have been 
moved in a first attempt and then the other half in a sec- 
ond attempt. This restores the overall acceptance ratio 
to about 33%. In fact, we have monitored separately the 
acceptance ratio for the four different steps necessary to 
update all the coordinates associated with a given parti- 
cle and have made sure that the sampling is well balanced 
in the sense that the acceptance for each individual step 
is about 30% or larger. 

As a counting device, we define a pass as the mini- 
mal set of Monte Carlo attempts over all variables in the 
system. A pass consists of 2 ■ 22 = 44 basic steps for 
all simulations with n v < 256. For the Wiener- Fourier 



reweighted technique with n v = 512, a pass consists of 
4 • 22 = 88 basic attempted moves. One also defines a 
block as a computational unit made up of ten thousand 
passes. 



B. Embarrassingly parallel computation 

In order to achieve a statistical error of about 0.1 
K/molecule for all computed energies, we have employed 
a large number of Monte Carlo passes (10.4 million) and 
we have divided the computation in 16 independent tasks 
to be run in parallel. For the Wiener-Fourier reweighted 
method with n v = 512, we have utilized a number of 
40 million passes divided in 80 independent tasks. The 
Monte Carlo simulations are embarrassingly parallel pro- 
vided that one can generate independent streams of uni- 
formly distributed random numbers. In this situation, 
there is no need for communication among the different 
code replica running on different nodes, and the program 
is an ideal candidate for use on a distributed computing 
cluster. However, to be mathematically rigorous, it is 
necessary to ensure that all the communication needed 
is already buried in the independence of the streams of 
random numbers. This underlies the need for "good" 
parallel random number generators. 

The Mersenne Twister (MT) is a fast serial pseudoran- 
dom number generating algorithm with a long period and 
good fc-distribution properties^ Quite interestingly, the 
algorithm allows for the development of random num- 
ber generators meeting certain user specifications. For 
instance, one may specify the period (which must be a 
Mersenne prime number i.e., a prime number of the form 
2 P — 1), the word size, or the memory size. Given a 
specified period, one may still produce various algorithms 
which differ by their characteristic polynomials. The dy- 
namic creation of distributed random number generators 
is based on the hypothesis that MT random number 
generators having relatively prime characteristic poly- 
nomials produce highly independent streams of random 
numbers^ Because the laws by which the numbers are 
generated are significantly different, it is very probable 
that the streams produced by the different generators are 
highly uncorrelated. In this paper, we have used the Dy- 
namic Creator C-language library^ with the Mersenne 
number 2 3217 — 1. The library outputs streams of 32- 
bit integers, which are easy to convert into real numbers 
on the interval [0, 1]. Different streams are identified by 
different identification numbers. The streams have been 
initialized once at the beginning of the simulation with 
different seeds. 

Given the 16 streams of independent random numbers, 
the Monte Carlo simulation proceeds as follows. For each 
stream, one performs an independent simulation consist- 
ing of 65 blocks. These blocks are preceded by 13 equili- 
bration blocks, which are needed to bring the system into 
probable configurations but do not contribute to the aver- 
ages of the estimating functions. For the Wiener-Fourier 
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reweighted method with n v = 512, we use 80 indepen- 
dent streams of 50 blocks each, for a total of 40 million 
passes. The equilibration phase consists of 10 blocks for 
each stream. Ideally, the length of the individual streams 
should be chosen to be sufficiently large, that the aver- 
ages of the computed property for different streams are 
independent and normally distributed, as dictated by the 
central limit theorem. This requirement is satisfied by all 
simulations we have performed. 

We have collected individual averages for all blocks and 
streams and performed several statistical tests verifying 
the applicability of the central limit theorem as well as 
the independence between the block averages of same or 
different streams. Let {Zij : 1 < i < 16; 1 < j < 65} 
denote the block-averages of the property Z for stream i 
and block j (the RW-WFPI simulation for n v — 512 has 
been analyzed in a similar fashion). Under the assump- 
tion that the size of the blocks is large enough so that 
the correlation between different block-averages is negli- 
gible and under the assumption that the block-averages 
for different streams are highly uncorrelated, the values 
Zi .j should have a Gaussian distribution centered around 
the average value 

16 65 

Z = — — V V Z, , (44) 
16-65^^ J v ' 

i=l j = l 

with variance 

/ 16 65 \ 

y i=i j=i j 

The validity of this assumption can be verified with the 
help of the Shapiro- Wilks normality testi2& If the collec- 
tion of samples Zij does not pass the test, it does not 
necessarily follow that the samples Zi j are not indepen- 
dent, as their distribution is normal only if the size of the 
blocks is sufficiently large. At a significance level of 5%, 
we do not reject the Gaussian distribution hypothesis for 
all computed average properties. To within the statisti- 
cal significance of our calculations, the samples Zij can 
be assumed to be independent and have a Gaussian dis- 
tribution. 

A second set of tests consists in verifying that the 
row and column averages of Zi j have Gaussian distri- 
butions centered around Z with variances a 2 (Z)/65 and 
a 2 (Z)/l6, respectively. The validity of this distribution 
follows from the central limit theorem and the assump- 
tion that the samples Zij are independent and have a 
Gaussian distribution characterized by the average Z and 
the variance a 2 {Z). It is important to emphasize that the 
row averages must pass this test. As previously discussed, 
the number of blocks in a stream should be sufficiently 
large so that the row averages have the required distri- 
bution even if the independent samples Zij do not have 
a Gaussian distribution. Again, under the assumption of 
independence only, the row averages should have a Gaus- 
sian distribution centered around Z and have variance 



tr 2 (Z)/A^biocks for a sufficiently large number of blocks 
-^blocks- We have employed the Kolmogorov-Smirnov 
tesl^ to compare the distributions of the row and column 
averages with the theoretical Gaussian distributions. For 
all computed average properties, we find that the respec- 
tive distributions are identical at a statistical significance 
level of 5%. The agreement for the distribution of the 
row averages is evidence that the streams generated by 
the Dynamic Creator package are sufficiently indepen- 
dent, whereas the agreement for the distribution of the 
column averages is evidence that the block averages of 
the same streams are independent. 

For the third set of tests, we have considered two time- 
series {Z[, 1 < i < 16 • 65} and {Z'( ', 1 < i < 16 • 65} 
obtained by concatenating the rows of the matrix Zij 
and the columns, respectively. We then have studied the 
autocorrelation of the two time series for a maximum lag 
of 32. The correlation coefficients for a lag k < 32 are 
computed with the formula 

^ ^ 16-65 

r ' k = oHZ) 16~65 ^ ( ^ " ^ ^ Z[+k ~ ^ ' 

where Z' i+k = Z' i+k _ 16 . 65 if i + k > 16 • 65. Under the in- 
dependence hypothesis of the samples Z' il the statistics of 
the correlation coefficients for 1 < k < 32 is normal with 
average zero and standard deviation a' — 1/ Vl6 • 65. 
Moreover, the correlation coefficients can be regarded as 
independent samples of this normal distribution. By the 
binomial distribution, the probability that at most m cor- 
relation coefficients lie outside the interval [—2a', 2a'] is 
given by the formula 

^-fa^Ui^ -«)-*■ 

where q « 0.046 is the probability that a normal dis- 
tributed variable of mean zero and standard deviation 
a' lies outside the interval [—2a' ,2a'}. One computes 
P(3) = 0.942 and P(4) = 0.985 so at a level of signifi- 
cance of 5%, the hypothesis that the r' k are independent 
samples of a normally distributed variable of mean zero 
and standard deviation a' = l/\/16 • 65 should be re- 
jected if 4 or more correlation coefficients lying outside 
the interval [—2a', 2a'} are observed. 

The autocorrelation of the series Z[ is representative 
of the correlation between the block averages of same 
streams, whereas the autocorrelation of the time series 
Z'l is representative of the correlation between the blocks 
of similar rank corresponding to different streams. Fig.^ 
shows the correlograms of the two series for a RW-WFPI 
Monte Carlo simulation with n v = 32. The computed 
property is the H-method energy estimator. Both series 
Z[ and Z'l have only one point lying outside the interval 
[— 2a' , 2a']. These points are r' 5 and r'^, respectively (of 
course, the points r' = Tq = 1 arc not counted). Conse- 
quently, the simulation passes our third statistical test. 
In fact, all the simulations performed have passed this 
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FIG. 1: Correlograms for the time-series Z[ and Z". 
The property Z is the average ensemble energy computed 
by means of the H-method estimator using the RW-WFPI 
method with n v = 32. One notices that both the correlation 
between the block averages (r' k ) and the correlation between 
the streams (rj.') are negligible. 



statistical test for all computed properties. We conclude 
that the correlation between the block averages of same 
or different streams is negligible. By the central limit 
theorem, the statistical error in the determination of the 
average of the property Z is 



±2cr(Z)/Vl6 • 65, 



(46) 



where <r 2 (Z) is defined by Eq. (|45l) . (For the statistical er- 
ror, we employ the 2a value, corresponding to an interval 
of 95% confidence. The 5% probability that the results 
lie outside the confidence interval is chosen to agree with 
the level of significance of the statistical tests). 

The analysis performed in the present subsection 
demonstrates that the streams generated by the Dynamic 
Creator algorithm have negligible correlation at least for 
our purposes. 

A separate advantage in the use of indepen- 
dent streams is to overcome the phenomenon of 
quasiergodicity, 31 which might appear in Monte Carlo 
simulations whenever the distribution that is sampled has 
several well defined minima that are separated by walls 
of high energy. In this case, the random walker may be 
trapped in one of the wells and never sample the others, 
or sample them with the wrong frequency. The Monte 
Carlo simulation may pass all the aforementioned statis- 
tical tests but still produce the wrong results. For our 
system, the probability that such a situation may occur 
is quite low because the system is highly quantum me- 
chanical with strong barrier tunneling. Moreover, the 16 
independent streams have been initialized randomly in 
configuration space. This makes it unlikely that all the 
streams are trapped precisely into the same local min- 
imum or group of local minima. Evidence for quasier- 
godicity may be captured in the form of a few outlying 
averages among the stream averages. Such outlying av- 



erages have not been observed. 



C. Summary and discussion of the computed 
averages 

The computed averages for all methods and estima- 
tors utilized are presented in Tables III ITT1 and IIIII For a 
given number of path variables n v , the RW-WFPI, RW- 
LCPI, and TT-DPI methods utilize 2n v , 2.25n„, and n v 
quadrature points, respectively. [For a discussion of the 
minimal number of quadrature points and of the nature 
of the quadrature schemes that must be employed for 
the first two methods, the reader should consult Ref. 0. 
For the RW-WFPI method, we have utilized 2n v Gauss- 
Legendre quadrature points, though a number of 1.75n v 
points would have sufficed.] The observed overall com- 
putational time for the three methods have followed the 
ratios 2 : 2.25 : 1, even though the time necessary to com- 
pute the paths is proportional to v? v for the first method 
and to n v log 2 (n„) for the other methods. The computa- 
tion of the paths takes full advantage of the vector float- 
ing point units of the modern processors and is domi- 
nated by the calls to the potential, except for very large 
n v . 

As discussed in Ref. the asymptotic convergence for 
the reweighted techniques is expected to be cubic, even 
for the Lennard- Jones potential that is not included in 
the class of potentials for which cubic convergence has 
been demonstrated formally. We find that the asymp- 
totic convergence is attained only for very large fly , as 
one may see by comparing for example the total, poten- 
tial, and kinetic energies computed with the help of the 
T-method estimator for the RW-LCPI and the TT-DPI 
methods. Even if the latter method has only 1/n^ asymp- 
totic convergence, the two methods produce almost equal 
results. In fact, a numerical analysis of the relationship 



(E) 



f},n v 



(E) 



' 



const 



in which the left-hand side quantity is plotted against 
l/(n v ) a for different values of a, suggests that, while the 
methods have converged within the statistical error, none 
of the three methods includes sufficiently large values of 
n v to attain the ultimate asymptotic rate of convergence. 

When comparing the values of the H-method energy 
estimator and of the related potential and kinetic esti- 
mators for the three path integral techniques, one no- 
tices that the RW-LCPI technique provides better values 
than the TT-DPI method. The H-method estimator has 
a better behavior if used together with a reweighted tech- 
nique. This behavior is consistent with the analysis we 
have performed in Section II on the values of the poten- 
tial point-estimators and the excellent values found with 
the RW-WFPI method. For the reweighted techniques, 
the H-method estimator provides better energy values 
than the T-method estimator. This is also true of the 
potential and kinetic parts of the estimators. However, 
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the variance of the H-method estimator is significantly 
larger than the variance of the T-method estimator and 
the difference is even more pronounced if one compares 
the corresponding kinetic estimators. 

As discussed in Section II. A, the path estimator for 
the potential energy has a smaller variance than the 
point estimator. Indeed, the results from Table [I] show 
that the variance of the path estimator is approximately 
(0.9/0.6) 2 = 2.25 times smaller than the variance of the 
point estimator. In the case of the RW-LCPI and TT- 
DPI methods, we have employed the estimator given by 
Eq. l(23|) and the path estimator, respectively. These were 
shown to produce values identical to the point estimator 
but have the variance of the path estimator. For the RW- 
WFPI and RW-LCPI methods, the point and the path 
estimators produce different results. Due to the very de- 
sign of the reweighted techniques, we have argued that 
the point estimator results should be the more accurate 
ones. This theoretical prediction is well supported by the 
values presented in Tables HI and ILTI 

While we have argued that the H-method estimator is 
a better estimator as value (but not necessarily as vari- 
ance) than the T-method estimator for the reweighted 
methods, it is apparent from Table 1TTT1 that the same dif- 
ference persists for the trapezoidal Trotter scheme. As 
discussed before, for the TT-DPI method, the point and 
path estimators provide the same value for the average 
potential. As opposed to the reweigthed techniques, the 
H-method kinetic estimator is less accurate than the T- 
method kinetic energy estimator. Quite interestingly, 
even if individually the potential and the kinetic parts are 
more accurate for the T-method estimator, it is the H- 
method energy estimator that provides a more accurate 
total energy. Clearly, a strong compensation of errors 
appears in the case of the H-method estimator. Such a 
compensation of errors is generally characteristic of varia- 
tional methods. In this respect, notice that the TT-DPI 
density matrices are positive definite because they are 
obtained by Lie- Trotter composing a certain symmetri- 
cal short-time approximation. By the Ritz variational 
principle, the H-mcthod energy estimator cannot have a 
value smaller than the ground-state energy. Thus, the 
Ritz variational principle provides some control on the 
values of the H-method estimator, but not on the indi- 
vidual components, nor on the T-method estimator. The 
RW-LCPI density matrices are also positive definite for 
n > 2 and indeed, the energy provided by the H-method 
estimator is still better than what the values of the po- 
tential and kinetic parts suggest. While a final resolution 
awaits further study, it is apparent that this finding is not 
related to the asymptotic rate of convergence of the path 
integral technique. 

Among the three methods presented, the RW-WFPI 
has the fastest convergence for all properties studied. 
Moreover, for n v = 128 and n v = 256, there is a good 
agreement (within statistical noise) between the T- and 
the H-method energy estimators, as well as between their 
potential and kinetic energy components. For n v = 256, 



one concludes that the systematic error is smaller than 
the statistical error for all properties computed. An addi- 
tional RW-WFPI simulation with n v = 512 in 40 million 
Monte Carlo passes has produced results consistent with 
the findings above. The results are summarized in Ta- 
ble ]W\ and represent the energy values we report. 



IV. CONCLUSIONS 

In the present work we have considered a number of is- 
sues related to the choice of estimators for random series 
path integral methods. We have illustrated our results by 
applying them to the problem of computing various ther- 
modynamic properties of a model of the ^2)22 cluster 
using reweighted path integral techniques. The molec- 
ular hydrogen cluster is a strongly quantum mechanical 
system and is representative of the type of problems one 
is likely to encounter in many applications. Hence, it 
constitutes a useful benchmark for present and future 
path integral techniques and for this reason it is impor- 
tant that its physical properties be determined within 
advertized statistical error bars. Path integral methods 
capable of dealing with such highly quantum-mechanical 
systems in an efficient manner are needed, both for re- 
liable determinations of the physical properties of the 
respective systems as well as for accurate parameteriza- 
tions of the intermolecular potentials. 

We wish to make a number of points concerning the 
present results and the methods we have utilized to ob- 
tain them. At a more general level, we would like to em- 
phasize that the reweighted path integral methods dis- 
cussed here provide a broadly applicable, simple, and 
formally well characterized set of techniques. As demon- 
strated by the present results, they are capable of produc- 
ing high-quality numerical results for problems of appre- 
ciable physical complexity. Moreover, they do so with- 
out the assumption of a particular form for the underly- 
ing microscopic forces. Furthermore, the estimators de- 
scribed in the present paper are convenient, accurate, 
and easily implemented for any random series approach. 
As discussed in Section III, when used together, the T 
and H-method estimators provide an important consis- 
tency check on the quality of the path integral simula- 
tions. Such consistency checks are a valuable element in 
judging the reliability of particular simulations. 

As previously mentioned, the cluster application dis- 
cussed here provides a convenient test bed for the de- 
velopment of numerical methods. For this reason, we 
have exercised due diligence with respect to the qual- 
ity of our final results summarized in Table llVl As dis- 
cussed in Section III, we have subjected both the par- 
allel random number generator employed and the nu- 
merical results obtained to a series of quality-control 
tests. Beyond these statistical checks, it is important 
to note there is an internal consistency check on the 
quality of the present results. Specifically, as is ev- 
ident in Tables [I] |Hj and IIIII the kinetic, potential, 
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and total energies from the three different path inte- 
gral approaches (trapezoidal Trotter, reweighted Levy- 
Ciesielski, and reweighted Wiener-Fourier) all agree. It 
is also important to note in this context that, while the 
presently computed total energies agree with those re- 
ported by Chakravarty et al.^ the individual kinetic and 
potential energies do not. The kinetic energy reported 
by Chakravarty et al~ is approximately 0.8 K/particle 
higher than found in the present simulations (with the 
potential energy being correspondingly lower). The mag- 
nitude of this difference is well outside the statistical error 
bars involved and appears to signal a systematic error. 
Based on the observed consistency between the results 
produced by three different path integral methods and 
on the agreement between the T and H-method estima- 
tors for each of these path integral formulations, we feel 
confident of the results we have reported in Table llVl 

Note: After the present simulations had been com- 
pleted, we have learned from D. M. Ceperley that the 
off-diagonal pair density used as the starting point in the 
simulations reported in Ref. was truncated at first or- 
der in the expansion of off-diagonal displacements instead 
of second order and that the inclusion of this second order 
term resolves the kinetic and potential energy difference 
noted above. 
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APPENDIX A 



The main purpose of this section is to give a compact 
form for the integral 



/ dP^X^x' ,6; 0)O[x r (6) + aB° 9 (a)}, (Al) 
Jn 



where 9 is an arbitrary point in the interval [0,1]. In 
terms of a standard Brownian motion [see Eq. (I}], the 
above integral can be put into the form 



P [o-Bi = x'\aB =x]E 



-f3f* V(aB u )du 



0{aBg) 



oB\ = x , <jBq = x 



dyO(y)P [<jB 1 = x', aBg = y\aB = x] E 



-f3^V(aB u )du 



o~B\ = x' , aBg — y, ctBq = x 



(A2) 



Using the Markov property of the Brownian motion, one 
readily justifies the equalities 

P [aBi — x' , aBg = y\aBo = x\ 

= P [o-Bx = x'\aBg =y]P [a B B = y\aB = x] (A3) 
= Pfp{x, y, 9(3)p fp [y, x'\ (1 - 9)13] 



and 



E 



= E 

xE 



e ~l3^V(aB u )du 
~ e -l3f e V(aB u )du 
-Ull V(crB u )du 



oB\ = x' , aBg = y, a Bo = x 
aBg = y, aB = x 
aBi = x ', aBg = y 



(A4) 



Performing the transformation of coordinates u' = u — 
9 in the second factor of the right-hand side of Eq. (|A4|) 



and employing the invariance of the Brownian motion 
under time translation 

{aB u+ g\aBg = y,aB x =x',u> 0} 



{aB u \aB Q = y, aB^g = x',u> 0} 



one obtains 



E 



-0^V(aB^)du 



aB\ = x' , 



E 



-j3f°V(aB u )du 



y, aB = x 



aBg = y, aB Q = x (A5) 



xE 



-S3 Jg~ e V(aB u )du 



Let us focus on the term 



-/3j£v(<TB„)du 



aBx-g = x', aB = y 



aBg = y, aB = x 
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Performing the substitution of variables v! = u/9 and 
employing the scaling property of the Brownian motion 

{aB u g \ctBq — x, aBg = y, u > 0} 

= \o6 1/2 B u a8 1/2 B Q = x, a9 1/2 B 1 =y,u>o\, 

one proves 

E [e-^o V(<rB u )dn aBg = tf) aBfj = x 

= E \ e -^S l v(,e^B u )du\ e y2 Bi = y^ e i/2 Bo = x 

= p(x,y;6[3)/p fp (x,y;6f3). (A6) 

In a similar fashion, one demonstrates that 

E [e-^o- e n°B u)d u\ aBi _ g = x , aBo = y 

= p [y, x'; (1 - 9)f3] / Pfp [y, x'\ (1 - 6)0\ . (A7) 

We now combine Eqs. (STjl . l|A"2)) . (|A"3)l . tKfy . tKfty . 
and l)A7|l to obtain 

dPfflX^x, x', a; f3)0[x r {6) + aB a e {a)] 



= / p(x,y;60)p[y,x';(l - 6)/3}0(y)dy. (A8) 

With the help of Eq. I|A8J| and by cyclic invariance, 

/ dx I dP[d]X^(x,d) P)0[x + <xB° g (a)] 
Jr Jq 

= f dx f dyp(x,y;ep)p[y,x;(l-9)P}0(y) 

JR JR 

= f dyp(y,y;(3)0(y) (A9) 



dx / dP[a]X 00 (x,a;/3)0(a;). 
: Jn 



Moreover, since the function 0(x) is arbitrary, the last 
identity also implies that the random variables x and 
x + aBg (a) have identical distribution functions under 
the probability measure 



X OD (x, a; (3)dx dP[a] 
Jr dx In dP[°] X oc a; 0) ' 



By setting 0(x) = 1 in Eq. 
known product formula 



one obtains the well- 



p(x,x';l3)= / dP[a]X 00 (x,x' ,a; f3) 



= / p(x, y; 6p)p[y, x'; (1 - 9)0\dy, (A10) 



which is seen to be a consequence of some basic properties 
of the Brownian motion. 
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TABLE I: Listed are the results obtained by the Wiener- Fourier reweighted path integral method. Average potential (V)^, 
kinetic {K)p, and total energies (E)p are calculated with the help of the T- and H-estimators as functions of the number of 
path variables n v . The error bars are two standard deviation values. All energies are given in K/molecule. 



n v 


{E)'g 


/TtTB 

{E)g 


{V)'b 


7TTTT7 


Wis 


1 r r\ U 


4 


-57.66 ± 0.05 


-16.63 ± 0.18 


-82.14 ± 0.07 


-61.72 ± 0.12 


24.48 ± 0.02 


45.09 ± 0.15 


8 


-37.61 ± 0.05 


-17.77 ± 0.16 


-64.74 ± 0.06 


-53.07 ± 0.11 


27.13 ± 0.02 


35.29 ± 0.13 


16 


-25.68 ± 0.04 


-18.28 ± 0.13 


-54.27 ± 0.06 


-49.33 ± 0.10 


28.60 ± 0.03 


31.06 ± 0.11 


32 


-20.23 ± 0.04 


-18.00 ± 0.12 


-49.66 ± 0.06 


-48.05 ± 0.10 


29.42 ± 0.03 


30.05 ± 0.11 


64 


-18.29 ± 0.04 


-17.85 ± 0.11 


-48.19 ± 0.06 


-47.86 ± 0.09 


29.90 ± 0.03 


30.01 ± 0.11 


128 


-17.75 ± 0.04 


-17.64 ± 0.12 


-47.83 ± 0.06 


-47.81 ± 0.09 


30.08 ± 0.03 


30.17 ± 0.11 


256 


-17.71 ± 0.04 


-17.70 ± 0.12 


-47.85 ± 0.07 


-47.87 ± 0.10 


30.14 ± 0.03 


30.17 ± 0.12 



TABLE II: Listed are the results obtained by the Levy-Ciesielski reweighted path integral method. The format is that of Table 

in 



n v 


(m 




(vy 


(V)% 


(KY 


(K)1 


3 


-70.46 ± 0.06 


18.24 ± 0.20 


-93.47 ± 0.07 


-69.03 ± 0.09 


23.01 ± 0.02 


87.27 ± 0.19 


7 


-44.08 ± 0.05 


-10.81 ± 0.15 


-71.03 ± 0.06 


-55.08 ± 0.08 


26.94 ± 0.02 


44.28 ± 0.14 


15 


-29.84 ± 0.04 


-15.84 ± 0.12 


-58.33 ± 0.06 


-49.10 ± 0.07 


28.50 ± 0.02 


33.26 ± 0.12 


31 


-22.76 ± 0.04 


-17.40 ± 0.10 


-51.95 ± 0.06 


-47.83 ± 0.06 


29.19 ± 0.03 


30.43 ± 0.11 


63 


-19.50 ± 0.04 


-17.68 ± 0.10 


-49.15 ± 0.06 


-47.69 ± 0.06 


29.65 ± 0.03 


30.01 ± 0.11 


127 


-18.25 ± 0.04 


-17.68 ± 0.10 


-48.20 ± 0.06 


-47.80 ± 0.06 


29.95 ± 0.03 


30.11 ± 0.11 


255 


-17.84 ± 0.04 


-17.65 ± 0.11 


-47.93 ± 0.07 


-47.85 ± 0.07 


30.09 ± 0.03 


30.20 ± 0.12 



TABLE III: Listed are the results obtained by the trapezoidal Trotter discrete path integral method. The format is that of 
Table||] 



n„ 




(E)% 




(V)% 


m 




3 


-68.54 ± 0.05 


78.08 ± 0.30 


-89.88 ± 0.07 


-89.88 ± 0.07 


21.34 ± 0.02 


167.97 ± 0.32 


7 


-45.29 ± 0.05 


7.22 ± 0.19 


-70.88 ± 0.06 


-70.88 ± 0.06 


25.58 ± 0.02 


78.10 ± 0.21 


15 


-30.61 ± 0.04 


-12.52 ± 0.13 


-58.53 ± 0.06 


-58.53 ± 0.06 


27.92 ± 0.02 


46.01 ± 0.15 


31 


-22.95 ± 0.04 


-16.86 ± 0.11 


-51.99 ± 0.06 


-51.99 ± 0.06 


29.04 ± 0.03 


35.14 ± 0.12 


63 


-19.55 ± 0.04 


-17.66 ± 0.10 


-49.19 ± 0.06 


-49.19 ± 0.06 


29.65 ± 0.03 


31.53 ± 0.11 


127 


-18.29 ± 0.04 


-17.70 ± 0.10 


-48.27 ± 0.06 


-48.27 ± 0.06 


29.97 ± 0.03 


30.57 ± 0.11 


255 


-17.86 ± 0.04 


-17.71 ± 0.11 


-47.94 ± 0.07 


-47.94 ± 0.07 


30.07 ± 0.03 


30.23 ± 0.12 
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TABLE IV: Estimated energies in K/molecule for the 
(112)22 cluster computed with the help of the Wiener- Fourier 
reweighted technique using 512 path variables and 40 million 
Monte Carlo passes. Listed are the average potential (V)p, 
kinetic {K)p, and total energies {E)p calculated with the help 
of the T-method (left column) and H-method (right column) 
estimators. The reported errors are two standard deviations. 





-17.69 ± 0.02 | 


(E)» g 


-17.71 ± 0.06 


{V)l 


-47.82 ± 0.03 j 




-47.81 ± 0.05 


(K) T p 


30.13 ± 0.02 j 




30.10 ± 0.06 



